High-performance streaming applications are emerging as a new and distinct class of programs, driven by the widespread use of streaming services on the Internet. Much research has been done to provide efficient support for streaming programs. Meanwhile, MapReduce has proven successful in massive-data processing domains such as search engines. It remains unknown whether the MapReduce paradigm can be used to accelerate stream processing.

This work assesses the viability of stream processing in Hadoop, an open-source implementation of MapReduce. While keeping the MapReduce interface intact, we have successfully decoupled the internal computation from the Hadoop Distributed File System (HDFS). Moreover, running map and reduce tasks concurrently makes it possible to adjust the computation dynamically as the characteristics of the input change. This work is a first research step toward providing an efficient, scalable platform for streaming applications. While implementing this project, the authors gained extensive hands-on experience with the Hadoop source code.